Parallel Algorithms for Top-k Query Processing

Authors

Neil Zhenqiang Gong

Guang-Zhong Sun

Dongqu Chen

Abstract

The general problem of answering top-k queries can be modeled using lists of objects sorted by their local scores. Fagin et al. introduced the "middleware cost" of a top-k query algorithm and proposed the efficient sequential Threshold Algorithm (TA). However, since the dataset can be extremely large, the middleware cost of sequential TA may be intolerable. In this paper, we therefore propose parallel algorithms to process top-k queries and analyze their middleware costs. A naive parallel algorithm, called PTA (Parallel-TA), evenly partitions the original dataset into P subdatasets, where P is the number of processors. Each processor finds the top-k results of its corresponding subdataset using TA, and these results are then merged to obtain the final top-k answers. Motivated by the idea of partitioning objects, we take a further step and partition the dataset D into n subdatasets according to their degree of domination. Based on this partition, we propose the EPTA (Enhanced-PTA) algorithm. Under the PRAM-CRCW model, the middleware cost of PTA is $O(nm^2/P)$, while the average middleware cost of EPTA is $O(km^2 (\ln n)^{m-1}/(m-1)!)$ under the assumption that scores in different lists are independently distributed, where n is the dataset size and m is the number of lists. Extensive experiments show that the speedup ratios of EPTA are significantly higher than those of PTA.
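The PTA scheme described in the abstract (partition, run TA per partition, merge) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `threshold_algorithm` and `pta`, the in-memory dictionary layout, the round-robin partitioning, and the use of `sum` as the monotone aggregation function are all assumptions made for illustration, and the P processors are simulated by a sequential loop. EPTA is not covered.

```python
import heapq

def threshold_algorithm(scores, k, agg=sum):
    """Sequential TA sketch. `scores` maps object id -> tuple of m local scores.
    `agg` is assumed to be a monotone aggregation function."""
    if not scores or k <= 0:
        return []
    m = len(next(iter(scores.values())))
    # One list per attribute, sorted by descending local score.
    lists = [sorted(scores, key=lambda o, i=i: scores[o][i], reverse=True)
             for i in range(m)]
    seen = set()
    top = []  # min-heap of (aggregate score, object id), at most k entries
    for depth in range(len(scores)):
        last = []  # local scores seen at this depth, one per list
        for i in range(m):
            obj = lists[i][depth]                # sorted access
            last.append(scores[obj][i])
            if obj not in seen:
                seen.add(obj)
                total = agg(scores[obj])         # random access to the other lists
                if len(top) < k:
                    heapq.heappush(top, (total, obj))
                elif total > top[0][0]:
                    heapq.heapreplace(top, (total, obj))
        # No unseen object can score above the aggregate of the last-seen scores.
        threshold = agg(last)
        if len(top) == k and top[0][0] >= threshold:
            break
    return sorted(top, reverse=True)

def pta(scores, k, num_parts, agg=sum):
    """Naive PTA sketch: partition evenly, run TA per part, merge the results.
    Each loop iteration stands in for one processor (an assumption)."""
    ids = list(scores)
    candidates = []
    for p in range(num_parts):
        part = {o: scores[o] for o in ids[p::num_parts]}  # round-robin split
        candidates.extend(threshold_algorithm(part, k, agg))
    return heapq.nlargest(k, candidates)

if __name__ == "__main__":
    demo = {"a": (0.9, 0.8), "b": (0.7, 0.95), "c": (0.2, 0.1), "d": (0.85, 0.3)}
    print(pta(demo, k=2, num_parts=2))
```

Because every global top-k object is also among the top-k of its own partition, merging the per-partition results and keeping the k largest yields the same scores as running TA on the whole dataset.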

Similar resources

Parallel Probing of Web Databases for Top-k Query Processing

A "top-k query" specifies a set of preferred values for the attributes of a relation and expects as a result the k objects that are "closest" to the given preferences according to some distance function. In many web applications, the relation attributes are only available via probes to autonomous web-accessible sources. Probing these sources sequentially to process a top-k query is inefficient, ...

As-Soon-As-Possible Top-k Query Processing in P2P Systems

Top-k query processing techniques provide two main advantages for unstructured peer-to-peer (P2P) systems. First, they avoid overwhelming users with too many results. Second, they significantly reduce network resource consumption. However, existing approaches suffer from long waiting times, because top-k results are returned only when all queried peers have finished processing the query....

eSPAK: Top-K Spatial Keyword Query Processing in Directed Road Networks

Given a query location and a set of query keywords, a top-k spatial keyword query ranks objects based on the distance to the query location and textual relevance to the query keywords. Several solutions have been proposed for top-k spatial keyword queries in Euclidean space. However, few algorithms study top-k keyword queries in undirected road networks where every road segment is undirected. Ev...

Implementation of the direction of arrival estimation algorithms by means of GPU-parallel processing in the CUDA environment (Research Article)

Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic warfare, sonar, etc. Beamforming methods such as Minimum Variance Distortionless Response (MVDR) and Delay-and-Sum (DAS), and the subspace-based Multiple Signal Classification (MUSIC), are the best-known DOA estimation techniques. These methods have high computational complexity. Hence using...

A Detailed Evaluation of Threshold Algorithms for Answering Top-k queries in Peer-to-Peer Networks

Ranking queries, also known as top-k queries, have drawn considerable attention due to their usability in various applications. Several algorithms have been proposed for the evaluation of top-k queries, and a large percentage of them follow the Threshold Approach. In P2P networks, top-k query processing can provide many advantages in both time and bandwidth consumption. We focus on the main ada...

Publication date: 2013
